ABSTRACT

In this paper, we investigate the problem of selecting, from $k\ (\ge 2)$ $m$-sided dice, the fairest die. The fairest die is the one corresponding to the smallest (unknown) value of $\theta_i = \sum_{j=1}^{m} \left(p_{ij} - \frac{1}{m}\right)^2$, where $p_{ij}$ denotes the $j$th cell (face) probability for the $i$th die. The proposed selection procedures are based on Schur-convex functions. The problem is studied in the context of the subset selection approach. For the small-sample case, a method for finding conservative solutions for the selection constants is given. Large-sample approximations are also provided. A related problem of selecting all good populations is also investigated. A procedure for selecting the die with the greatest bias is also proposed and studied. Tables of constants necessary to carry out the procedure for selecting the fairest die are given.
1.  INTRODUCTION

It frequently happens in problems concerned with ranking and selection that, whatever the original formulation or purpose of the experiment, the actual outcome is the rejection of certain processes and the acceptance of the remaining processes as being superior with respect to a desired characteristic. We shall try to formalize this in the special case when the observations are from multinomial distributions. For example, when bets are to be placed on the outcomes of an $m$-sided die, we are interested in the problem of finding which of several dice is the fairest. Let $\mathbf{p} = (p_1, \ldots, p_m)$ be an unknown vector, where $p_i$ denotes the probability of the $i$th outcome when we throw an $m$-sided die. How to characterize and select a fair die is the main concern in this kind of problem.

In practice, a Schur-convex or Schur-concave function of $\mathbf{p}$ may be appropriate. Two measures of diversity of a multinomial population have been commonly used: Shannon’s entropy and the Gini-Simpson index. The notion of the entropy function was introduced by Shannon (1948). The Gini-Simpson index was introduced by Gini (1912) and Simpson (1949). Both of these indices are Schur-concave functions of $\mathbf{p}$.

Gupta  and  Huang  (1976)  have  studied  the  problem  of  selecting  the  population  with 
the  largest  entropy  function  when  m  =  2.  Gupta  and  Wong  (1975)  have  considered 
the  problem  of  a  selection  procedure  based  on  a  Schur-concave  function  for  selecting  a 
subset  containing  the  population  with  the  largest  entropy.  Dudewicz  and  Van  der  Meulen 
(1981)  have  studied  a  selection  procedure  based  on  a  generalised  entropy  function.  More 
recently,  Alam,  Mitra,  Rizvi,  and  Saxena  (1986)  have  studied  selection  procedures  based  on 
Shannon’s  entropy  function  and  Gini-Simpson  index  using  the  indifference  zone  approach. 
Rizvi,  Alam,  and  Saxena  (1987)  have  also  considered  a  subset  selection  procedure  based 
on  diversity  indices. 

For $m = 2$, i.e. the binomial case, Sobel and Starr (1975) studied a selection procedure based on the criterion $|p_i - \frac{1}{2}|$. In this paper, we discuss the general case $m \ge 2$. We may use the criteria $\sum_{i=1}^{m} \left(p_i - \frac{1}{m}\right)^2$ or $\max_{1 \le i \le m} |p_i - \frac{1}{m}|$. Our main goal is to define (optimal) subset selection procedures based on $\theta = \sum_{i=1}^{m} \left(p_i - \frac{1}{m}\right)^2$. Note that $\theta$ is a Schur-convex function and is equivalent to the Gini-Simpson index. It should be pointed out that in our paper we improve the derivation of the results of Rizvi, Alam, and Saxena (1987). Our proofs are stronger and more general. It should be noted that since majorization is only a partial order relation, we need to make some assumptions about the parameter space.
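The equivalence between the squared-distance criterion and the Gini-Simpson index can be checked numerically. The following is a minimal sketch; the function names `phi` and `gini_simpson` are illustrative and not taken from the paper, and the Gini-Simpson index is assumed to be in its usual form, one minus the sum of squared cell probabilities.

```python
def phi(p):
    """Squared distance of a probability vector from the uniform vector."""
    m = len(p)
    return sum((pj - 1.0 / m) ** 2 for pj in p)


def gini_simpson(p):
    """Gini-Simpson diversity index, assumed here to be 1 - sum_j p_j^2."""
    return 1.0 - sum(pj * pj for pj in p)


# Example cell probabilities for a hypothetical 4-sided die.
p = [0.1, 0.2, 0.3, 0.4]

# For any probability vector, phi(p) + gini_simpson(p) = 1 - 1/m, so ranking
# dice by the smallest phi is the same as ranking by the largest index.
print(phi(p), gini_simpson(p), phi(p) + gini_simpson(p))
```

Because the two quantities differ only by a constant and a sign, any selection rule stated in terms of one can be restated in terms of the other.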

Let $\pi_1, \ldots, \pi_k$ denote $k$ dice with unknown probability vectors $\mathbf{p}_1, \ldots, \mathbf{p}_k$ respectively, where $\mathbf{p}_i = (p_{i1}, \ldots, p_{im})$, $m \ge 2$, $p_{ij} \ge 0$, $\sum_{j=1}^{m} p_{ij} = 1$, $i = 1, \ldots, k$. We define

$$\theta_i = \varphi(\mathbf{p}_i) = \sum_{j=1}^{m} \left(p_{ij} - \frac{1}{m}\right)^2 \tag{1.1}$$

and

$$\Omega = \{\omega = (\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_k)\}.$$

Let $\theta_{[1]} \le \cdots \le \theta_{[k]}$ denote the ordered values of $\theta_1, \ldots, \theta_k$. It is assumed that the exact pairing between the ordered parameters $\theta_{[i]}$ and the unordered $\theta_i$ is unknown. The unknown population associated with the smallest parameter $\theta_{[1]}$ is called the best population. Our goal is to define a selection procedure which selects a non-trivial, non-empty subset of $\{\pi_1, \ldots, \pi_k\}$ and satisfies the basic probability requirement, that is,

$$\inf_{\Omega} P(\mathrm{CS}) \ge P^* \tag{1.2}$$

where $k^{-1} < P^* < 1$ and CS stands for a correct selection, that is, the selection of a subset which includes the best population.

In  Section  2,  we  formulate  the  problem,  define  the  selection  procedure,  and  study  its 
properties.  In  Section  3,  we  consider  the  problem  of  selecting  all  good  populations.  In 
Section  4,  we  propose  and  study  a  procedure  for  selecting  the  die  with  the  greatest  bias. 
Tables  of  constants  d  =  d(k,n,m,P*)  are  provided  for  m  =  2  and  selected  values  of  k,n 
and  P*. 

2.  SELECTING THE FAIREST DIE

Suppose that we have $n$ independent observations from each of the $k$ dice. Let $X_{ij}$ denote the number of outcomes of the $j$th side in the $i$th die. Then $\mathbf{X}_i = (X_{i1}, \ldots, X_{im})$ follows a multinomial distribution with parameters $n$ and $\mathbf{p}_i = (p_{i1}, \ldots, p_{im})$. We will denote it by $\mathbf{X}_i \sim M(n, \mathbf{p}_i)$. We are interested in the population associated with the smallest parameter $\theta_{[1]}$, where $\theta_i$ is defined by (1.1). A reasonable estimator of $\theta_i$ is $Y_i = \varphi(\frac{1}{n}\mathbf{X}_i)$, and a natural selection procedure $R_1$ is proposed as follows:

$R_1$: Select $\pi_i$ if and only if $Y_i \le \min_{1 \le j \le k} Y_j + d$, where $d$ is the smallest non-negative number such that the probability requirement (1.2) is satisfied.
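As an illustration of the rule $R_1$, the sketch below applies it to observed face counts. The function name `select_fairest`, the example counts, and the value of `d` are all hypothetical; in practice `d` would come from the tables or approximations discussed in this section.

```python
def phi(p):
    """Squared distance of a probability vector from the uniform vector."""
    m = len(p)
    return sum((pj - 1.0 / m) ** 2 for pj in p)


def select_fairest(counts, d):
    """Rule R_1: keep die i iff Y_i <= min_j Y_j + d, with Y_i = phi(X_i / n).

    counts[i][j] is the number of times face j appeared for die i.
    Returns the (0-based) indices of the selected dice.
    """
    y = [phi([x / sum(row) for x in row]) for row in counts]
    cutoff = min(y) + d
    return [i for i, yi in enumerate(y) if yi <= cutoff]


# Hypothetical data: three 4-sided dice, n = 20 throws each.
counts = [
    [5, 5, 5, 5],
    [10, 4, 3, 3],
    [6, 5, 5, 4],
]
print(select_fairest(counts, d=0.02))
```

The selected subset is never empty, because the die with the smallest statistic always satisfies the inequality; a larger `d` can only enlarge the subset.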


In order to find the $d$-value, which depends on $k$, $n$, $m$, and $P^*$, we need some lemmas. For the definition of majorization and its basic properties, see Marshall and Olkin (1979). In the following, we will use $\mathbf{x} \prec \mathbf{y}$ to mean that $\mathbf{x}$ is majorized by $\mathbf{y}$.
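For readers who want to experiment with the majorization ordering, here is a small sketch of the standard partial-sum test (equal totals, and each partial sum of the decreasingly ordered components of the first vector no larger than that of the second). The function name and the tolerance argument are illustrative choices, not part of the paper.

```python
def is_majorized_by(x, y, tol=1e-12):
    """Return True if x is majorized by y.

    x is majorized by y when both vectors have the same total and every
    partial sum of the largest components of x is at most the corresponding
    partial sum for y.
    """
    if len(x) != len(y) or abs(sum(x) - sum(y)) > tol:
        return False
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    sum_x = sum_y = 0.0
    for a, b in zip(xs, ys):
        sum_x += a
        sum_y += b
        if sum_x > sum_y + tol:
            return False
    return True


uniform = [1 / 3, 1 / 3, 1 / 3]
print(is_majorized_by(uniform, [0.5, 0.3, 0.2]))            # uniform is majorized by everything
print(is_majorized_by([0.5, 0.25, 0.25], [0.4, 0.4, 0.2]))  # an incomparable pair
print(is_majorized_by([0.4, 0.4, 0.2], [0.5, 0.25, 0.25]))  # incomparable in both directions
```

The last two calls show why majorization is only a partial order: neither vector majorizes the other, which is the reason extra assumptions on the parameter space are needed.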

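Lemma 2.1. (Rinott (1973))

Let $\mathbf{X} \sim M(n, \mathbf{p})$ and let $\phi(\mathbf{x})$ be a Schur-convex (Schur-concave) function of $\mathbf{x}$. Then $E[\phi(\mathbf{X})]$ is a Schur-convex (Schur-concave) function of $\mathbf{p}$.

Lemma 2.2. Let $\mathbf{X} \sim M(n, \mathbf{p})$ and let $\psi(\mathbf{x})$ be a Schur-convex (Schur-concave) function of $\mathbf{x}$. Then $P\{\psi(\frac{1}{n}\mathbf{X}) \le c\}$ is a Schur-concave (Schur-convex) function of $\mathbf{p}$. Similarly, $P\{c \le \psi(\frac{1}{n}\mathbf{X})\}$ is a Schur-convex (Schur-concave) function of $\mathbf{p}$.

Proof. Define $\phi(\mathbf{x}) = I_{\{\psi(\frac{1}{n}\mathbf{x}) \le c\}}$, where $I_A$ is the indicator function of the set $A$. Then apply Lemma 2.1.

Lemma 2.3. Let $\psi(\mathbf{x})$ be a Schur-convex (Schur-concave) function of $\mathbf{x}$ and $\mathbf{X} \sim M(n, \mathbf{p})$. Then $P\{\psi(\frac{1}{n}\mathbf{x}) - d \le \psi(\frac{1}{n}\mathbf{X})\}$ is a Schur-concave (Schur-convex) function of $\mathbf{x}$ when $\mathbf{p}$ is fixed.

Proof. If $\mathbf{x} \prec \mathbf{y}$, then $\psi(\frac{1}{n}\mathbf{x}) \le \psi(\frac{1}{n}\mathbf{y})$ and

$$\left\{\psi\left(\tfrac{1}{n}\mathbf{y}\right) - d \le \psi\left(\tfrac{1}{n}\mathbf{X}\right)\right\} \subset \left\{\psi\left(\tfrac{1}{n}\mathbf{x}\right) - d \le \psi\left(\tfrac{1}{n}\mathbf{X}\right)\right\}.$$

Hence

$$P\left\{\psi\left(\tfrac{1}{n}\mathbf{y}\right) - d \le \psi\left(\tfrac{1}{n}\mathbf{X}\right)\right\} \le P\left\{\psi\left(\tfrac{1}{n}\mathbf{x}\right) - d \le \psi\left(\tfrac{1}{n}\mathbf{X}\right)\right\}.$$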

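Theorem 2.4. $P(\mathrm{CS}|R_1)$ is a Schur-concave function of $\mathbf{p}_{(1)}$ when all other $\mathbf{p}_{(i)}$, $i \ne 1$, are kept fixed and is a Schur-convex function of $\mathbf{p}_{(j)}$, $j \ne 1$, when all other $\mathbf{p}_{(i)}$, $i \ne j$, are kept fixed, where $\mathbf{p}_{(i)}$ denotes the probability vector corresponding to the unknown population with parameter $\theta_{[i]}$ and statistic $Y_{(i)}$.

Proof.

$$P(\mathrm{CS}|R_1) = P\{Y_{(1)} \le Y_{(j)} + d,\ j = 2, \ldots, k\} = E\{P\{Y_{(1)} - d \le Y_{(j)},\ j = 2, \ldots, k \mid Y_{(1)}\}\}.$$

Now, by independence, for $\mathbf{X}_{(1)} = \mathbf{x}$,

$$P\{Y_{(1)} - d \le Y_{(j)},\ j = 2, \ldots, k \mid \mathbf{X}_{(1)} = \mathbf{x}\} = \prod_{j=2}^{k} P\left\{\varphi\left(\tfrac{1}{n}\mathbf{x}\right) - d \le \varphi\left(\tfrac{1}{n}\mathbf{X}_{(j)}\right)\right\}. \tag{2.1}$$

By Lemma 2.3, when $\mathbf{p}_{(j)}$, $j \ne 1$, are fixed, (2.1) is a product of non-negative Schur-concave functions of $\mathbf{x}$ and hence a Schur-concave function of $\mathbf{x}$. Then, by Lemma 2.1, $P(\mathrm{CS}|R_1)$ is a Schur-concave function of $\mathbf{p}_{(1)}$. Also, by Lemma 2.2, each term of (2.1) is a Schur-convex function of $\mathbf{p}_{(j)}$. Hence the other part of the result follows.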

Since majorization is only a partial order relation, to simplify the problem, we may assume that there exists some $i$ such that $\mathbf{p}_i \prec \mathbf{p}_j$, $j = 1, \ldots, k$, $j \ne i$. For our problem, this assumption is reasonable because we expect that there exists a fair die. The following theorem provides the main result of this section.

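Theorem 2.5. Let $\Omega_1 = \{\omega = (\mathbf{p}_1, \ldots, \mathbf{p}_k) \in \Omega \mid \mathbf{p}_{(1)} \prec \mathbf{p}_{(j)},\ j = 2, \ldots, k\}$ and $\Omega_0 = \{\omega = (\mathbf{p}, \ldots, \mathbf{p}) \in \Omega\}$. Then

$$\inf_{\Omega_1} P(\mathrm{CS}|R_1) = \inf_{\Omega_0} P(\mathrm{CS}|R_1). \tag{2.2}$$

Proof. By Theorem 2.4 and the assumption $\mathbf{p}_{(1)} \prec \mathbf{p}_{(j)}$, $j = 2, \ldots, k$, the infimum is attained when $\mathbf{p}_{(1)} = \cdots = \mathbf{p}_{(k)}$.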

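Although we have found the relation in (2.2), we still do not know the exact point $\mathbf{p}$ at which the infimum is attained. For the small-sample case, we consider a conditional procedure, similar to the one proposed by Gupta and Huang (1976), to overcome this difficulty.

In the following, we may assume that $\mathbf{X}_1, \ldots, \mathbf{X}_k$ are i.i.d. because the infimum of $P(\mathrm{CS}|R_1)$ occurs at $\Omega_0$. Then

$$\left(\sum_{i=1}^{k} X_{i1}, \ldots, \sum_{i=1}^{k} X_{im}\right) \sim M(nk, \mathbf{p}).$$

For $\mathbf{t} = (t_1, \ldots, t_m)$, $0 \le t_j \le nk$, $j = 1, \ldots, m$ and $\sum_{j=1}^{m} t_j = nk$, let

$$M(k, d(\mathbf{t}), \mathbf{t}, m, n) = {\sum}^{*} \prod_{i=1}^{k} \binom{n}{s_{i1}, \ldots, s_{im}} \tag{2.3}$$

where $\sum^{*}$ denotes the summation over the set of all $m$-tuples $\mathbf{s}_i = (s_{i1}, \ldots, s_{im})$, $i = 1, \ldots, k$, such that $0 \le s_{ij} \le n$, $i = 1, \ldots, k$, $j = 1, \ldots, m$, $\sum_{i=1}^{k} s_{ij} = t_j$, $j = 1, \ldots, m$, $\sum_{j=1}^{m} s_{ij} = n$, $i = 1, \ldots, k$, and

$$\varphi\left(\tfrac{1}{n}\mathbf{s}_1\right) \le \min_{2 \le j \le k} \varphi\left(\tfrac{1}{n}\mathbf{s}_j\right) + d(\mathbf{t}),$$

for some constant $d(\mathbf{t})$ depending on $\mathbf{t}$. It is easy to prove the following lemma.

Lemma 2.6. Let $M(k, d(\mathbf{t}), \mathbf{t}, m, n)$ be defined as in (2.3). Then

$$P\left\{\varphi\left(\tfrac{1}{n}\mathbf{X}_1\right) \le \min_{2 \le j \le k} \varphi\left(\tfrac{1}{n}\mathbf{X}_j\right) + d(\mathbf{t}) \;\Big|\; \sum_{i=1}^{k} X_{ij} = t_j,\ j = 1, \ldots, m\right\} = M(k, d(\mathbf{t}), \mathbf{t}, m, n) \Big/ \binom{nk}{t_1, \ldots, t_m}$$

is independent of $\mathbf{p}$.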

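Using Lemma 2.6, we have the following result.

Theorem 2.7. For given $P^*$ and each $\mathbf{t}$, let $d(\mathbf{t})$ be the smallest number such that

$$M(k, d(\mathbf{t}), \mathbf{t}, m, n) \ge P^* \binom{nk}{t_1, \ldots, t_m} \tag{2.4}$$

and let

$$d = \max_{\mathbf{t}} d(\mathbf{t});$$

then

$$\inf_{\Omega_0} P(\mathrm{CS}|R_1) \ge P^*.$$

Proof. Writing $T_j = \sum_{i=1}^{k} X_{ij}$,

$$\begin{aligned}
\inf_{\Omega_0} P(\mathrm{CS}|R_1)
&= \inf_{\Omega_0} P\left\{\varphi\left(\tfrac{1}{n}\mathbf{X}_1\right) \le \min_{2 \le j \le k} \varphi\left(\tfrac{1}{n}\mathbf{X}_j\right) + d\right\} \\
&= \inf_{\Omega_0} \sum_{\mathbf{t}} P\left\{\varphi\left(\tfrac{1}{n}\mathbf{X}_1\right) \le \min_{2 \le j \le k} \varphi\left(\tfrac{1}{n}\mathbf{X}_j\right) + d \;\Big|\; T_j = t_j,\ j = 1, \ldots, m\right\} P\{T_j = t_j,\ j = 1, \ldots, m\} \\
&\ge \inf_{\Omega_0} \sum_{\mathbf{t}} P\left\{\varphi\left(\tfrac{1}{n}\mathbf{X}_1\right) \le \min_{2 \le j \le k} \varphi\left(\tfrac{1}{n}\mathbf{X}_j\right) + d(\mathbf{t}) \;\Big|\; T_j = t_j,\ j = 1, \ldots, m\right\} P\{T_j = t_j,\ j = 1, \ldots, m\} \\
&= \inf_{\Omega_0} \sum_{\mathbf{t}} P\{T_j = t_j,\ j = 1, \ldots, m\}\, M(k, d(\mathbf{t}), \mathbf{t}, m, n) \Big/ \binom{nk}{t_1, \ldots, t_m} \\
&\ge P^*.
\end{aligned}$$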


Remark: For small samples ($k$ and $n$ both small), for given $P^*$ and $\mathbf{t}$, we can easily determine the smallest $d(\mathbf{t})$ satisfying (2.4). From these, we have computed tables of $d$-values for $m = 2$, $k = 2(1)7$, $n = 2(1)15$, $P^* = 0.75$, 0.80, 0.90 and 0.95, which are given at the end of the paper.
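The conditional computation for $m = 2$ can be sketched by brute-force enumeration. This is a minimal sketch under stated assumptions, not the authors' program: it enumerates every outcome $(s_1, \ldots, s_k)$ of the first-face counts, weights it by the product of binomial coefficients, and for each total $t$ takes the smallest constant whose cumulative weight reaches $P^*$ times the number of arrangements. The names `phi2` and `d_value` are hypothetical, and exact rational arithmetic is used to avoid rounding issues at ties.

```python
from fractions import Fraction
from itertools import product
from math import comb


def phi2(s, n):
    """phi(s/n, 1 - s/n) for a two-sided die, computed exactly."""
    return 2 * (Fraction(s, n) - Fraction(1, 2)) ** 2


def d_value(k, n, p_star):
    """Conservative constant d = max_t d(t) for m = 2 (requires k >= 2)."""
    p_star = Fraction(p_star)
    by_total = {}  # t -> list of (smallest d making this outcome a correct selection, weight)
    for s in product(range(n + 1), repeat=k):
        gap = phi2(s[0], n) - min(phi2(sj, n) for sj in s[1:])
        need = max(gap, Fraction(0))
        weight = 1
        for si in s:
            weight *= comb(n, si)
        by_total.setdefault(sum(s), []).append((need, weight))

    d = Fraction(0)
    for t, outcomes in by_total.items():
        target = p_star * comb(n * k, t)
        running = 0
        for need, weight in sorted(outcomes):
            running += weight
            if running >= target:
                d = max(d, need)  # this is d(t); keep the largest over t
                break
    return d


print(float(d_value(2, 2, "0.75")), float(d_value(2, 4, "0.75")))
```

Under these assumptions the sketch gives 1/2 for $k = 2$, $n = 2$ and 3/8 for $k = 2$, $n = 4$ at $P^* = 0.75$, which agree with the entries .5000 and .3750 of Table I. The enumeration grows as $(n+1)^k$, which is why it is practical only for small $k$ and $n$.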

For large samples, the above computation is time-consuming. Hence, large-sample approximations are considered in the following.

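We know that $\frac{1}{n}\mathbf{X}$ is asymptotically multivariate normal with mean vector $\mathbf{p} = (p_1, \ldots, p_m)$ and covariance matrix $\Sigma = (\sigma_{ij})$, where $\sigma_{ii} = \frac{1}{n} p_i (1 - p_i)$ and $\sigma_{ij} = -\frac{1}{n} p_i p_j$, $i \ne j$. Then $\sqrt{n}\,(\varphi(\frac{1}{n}\mathbf{X}) - \varphi(\mathbf{p}))$ is asymptotically normal with mean 0 and variance

$$\sigma_n^2(\mathbf{p}) = 4\left[\sum_{i=1}^{m} p_i^3 - \left(\sum_{i=1}^{m} p_i^2\right)^2\right]. \tag{2.5}$$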
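The limiting variance of $\sqrt{n}\,(\varphi(\frac{1}{n}\mathbf{X}) - \varphi(\mathbf{p}))$ can be worked out with the delta method. The sketch below assumes that route: it compares the quadratic form $g^{\top}(\mathrm{diag}(\mathbf{p}) - \mathbf{p}\mathbf{p}^{\top})g$, with gradient $g_j = 2(p_j - 1/m)$, against the closed form $4[\sum_j p_j^3 - (\sum_j p_j^2)^2]$ that the quadratic form simplifies to. Both function names are illustrative.

```python
def asymptotic_variance(p):
    """Closed form 4 * (sum p_j^3 - (sum p_j^2)^2)."""
    return 4.0 * (sum(x ** 3 for x in p) - sum(x * x for x in p) ** 2)


def delta_method_variance(p):
    """Quadratic form g' (diag(p) - p p') g with g_j = 2 (p_j - 1/m)."""
    m = len(p)
    g = [2.0 * (x - 1.0 / m) for x in p]
    mean_g = sum(gj * pj for gj, pj in zip(g, p))
    return sum(gj * gj * pj for gj, pj in zip(g, p)) - mean_g ** 2


p = [0.1, 0.2, 0.3, 0.4]
print(asymptotic_variance(p), delta_method_variance(p))
```

Note that the variance vanishes at the uniform vector, so the normal approximation is degenerate for a perfectly fair die.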
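Theorem 2.8. For large $n$, we have

$$\inf_{\Omega_1} P(\mathrm{CS}|R_1) \approx \inf_{\mathbf{p}} \int_{-\infty}^{\infty} \Phi^{k-1}\left(x + \frac{\sqrt{n}\,d}{\sigma_n(\mathbf{p})}\right) d\Phi(x),$$

where $\Phi(x)$ is the cdf of the standard normal and $\approx$ means approximately equal.

Proof. Let $Z_i = \sqrt{n}\,(\varphi(\frac{1}{n}\mathbf{X}_i) - \varphi(\mathbf{p}_i)) / \sigma_n(\mathbf{p}_i)$, $i = 1, \ldots, k$; then $Z_i$, $i = 1, \ldots, k$, are asymptotically i.i.d. $N(0, 1)$. By Theorem 2.5, we have

$$\begin{aligned}
\inf_{\Omega_1} P(\mathrm{CS}|R_1) &= \inf_{\Omega_0} P(\mathrm{CS}|R_1) \\
&= \inf_{\mathbf{p}} P\left\{Z_1 \le Z_j + \frac{\sqrt{n}\,d}{\sigma_n(\mathbf{p})},\ j = 2, \ldots, k\right\} \\
&\approx \inf_{\mathbf{p}} \int_{-\infty}^{\infty} \Phi^{k-1}\left(x + \frac{\sqrt{n}\,d}{\sigma_n(\mathbf{p})}\right) d\Phi(x).
\end{aligned}$$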


Remark: Rizvi, Alam, and Saxena (1987) pointed out that

$$\sup_{\mathbf{p}} \sigma_n(\mathbf{p}) = \sigma_n(\mathbf{p}^0),$$

where $\mathbf{p}^0 = (p_0, \ldots, p_0, 1 - (m-1)p_0)$ and

$$p_0 = \frac{5m - 2 + (9m^2 - 4m + 4)^{1/2}}{8m(m-1)}. \tag{2.7}$$

Hence the value $d$ can be found by using the equation

$$\int_{-\infty}^{\infty} \Phi^{k-1}\left(x + \frac{\sqrt{n}\,d}{\sigma_n(\mathbf{p}^0)}\right) d\Phi(x) = P^*. \tag{2.8}$$

The integral in (2.8) has been tabulated by Bechhofer (1954), Gupta (1963), and Gupta, Nagel and Panchapakesan (1973).
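When the published tables of the integral are not at hand, the equation for $d$ can be solved numerically. The following is a rough sketch using only the Python standard library: a trapezoidal rule for $\int \Phi^{k-1}(x + c)\, d\Phi(x)$ and bisection on $c = \sqrt{n}\,d / \sigma$. The function names, the integration range, and the step count are assumptions of this sketch, and `sigma` stands for whatever value of $\sigma_n(\mathbf{p}^0)$ the user supplies.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def pcs_integral(c, k, lo=-8.0, hi=8.0, steps=4000):
    """Trapezoidal approximation of the integral of Phi(x + c)^(k-1) dPhi(x)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        w = 0.5 if i in (0, steps) else 1.0
        total += w * STD.cdf(x + c) ** (k - 1) * STD.pdf(x)
    return total * h


def solve_c(k, p_star, tol=1e-8):
    """Bisection for c >= 0 with pcs_integral(c, k) = p_star (1/k < p_star < 1)."""
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pcs_integral(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def d_large_sample(k, p_star, n, sigma):
    """d = c * sigma / sqrt(n); sigma is the user-supplied sigma_n(p0)."""
    return solve_c(k, p_star) * sigma / n ** 0.5


print(solve_c(2, 0.75))
```

For $k = 2$ the integral reduces to $\Phi(c/\sqrt{2})$, which gives a convenient check on the numerical solution.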

If we do not make the assumption “$\mathbf{p}_{(1)} \prec \mathbf{p}_{(j)}$, $j = 2, \ldots, k$”, we consider some partial solutions based on some other restrictions. Firstly, we consider the approach suggested by Rizvi, Alam, and Saxena (1987). For convenience, we assume that $\pi_1$ is the best population. Let

$$\sum_{r=1}^{s} p_{[m-r+1]} = \min_{2 \le i \le k} \left(\sum_{r=1}^{s} p_{i[m-r+1]}\right), \quad s = 1, \ldots, m, \tag{2.9}$$

where $p_{i[m-r+1]}$ is the $(m - r + 1)$st smallest component of $\mathbf{p}_i$. Then we can determine a vector $\mathbf{p}$ such that $\mathbf{p} \prec \mathbf{p}_i$, $i = 2, \ldots, k$, and if there is a $\mathbf{p}'$ with $\mathbf{p}' \prec \mathbf{p}_i$, $i = 2, \ldots, k$, then $\mathbf{p}' \prec \mathbf{p}$. Since $\varphi$ is Schur-convex, we have $\varphi(\mathbf{p}) \le \varphi(\mathbf{p}_i)$, $i = 2, \ldots, k$. We will consider the problem under the parameter space

$$\Omega_2 = \{\omega = (\mathbf{p}_1, \ldots, \mathbf{p}_k) \mid \varphi(\mathbf{p}_1) \le \varphi(\mathbf{p}) \le \varphi(\mathbf{p}_i),\ i = 2, \ldots, k\}.$$

We note that if $\mathbf{p}_1 \prec \mathbf{p}_i$, $i = 2, \ldots, k$, then $\mathbf{p}_1 \prec \mathbf{p}$ and hence $\varphi(\mathbf{p}_1) \le \varphi(\mathbf{p})$. Hence the parameter space $\Omega_2$ includes the parameter space considered by Gupta and Wong (1975).

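In the following, we will give a clearer proof of Theorem 4.1 in the paper of Rizvi, Alam, and Saxena (1987).

Theorem 2.9. The infimum of $P(\mathrm{CS}|R_1)$ over $\Omega_2$ is attained when $\mathbf{p}_i = \mathbf{p}$, $i = 2, \ldots, k$ and $\varphi(\mathbf{p}_1) = \cdots = \varphi(\mathbf{p}_k)$.

Proof. By Theorem 2.4, $P(\mathrm{CS}|R_1)$ is a Schur-convex function of $\mathbf{p}_i$, $i = 2, \ldots, k$. Let $P(\mathrm{CS}|R_1) = f(\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_k)$; then $f(\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_k) \ge f(\mathbf{p}_1, \mathbf{p}, \ldots, \mathbf{p})$. For $\omega = (\mathbf{p}_1, \ldots, \mathbf{p}_k) \in \Omega_2$, we have $\varphi(\mathbf{p}_1) \le \varphi(\mathbf{p}) \le \varphi(\mathbf{p}_i)$, $i = 2, \ldots, k$. If $\varphi(\mathbf{p}_1) < \varphi(\mathbf{p})$, let $\mathbf{p}_1 = (p_{11}, p_{12}, \ldots, p_{1m})$ with $p_{11} \le p_{12} \le \cdots \le p_{1m}$. For $\varepsilon > 0$, consider $\mathbf{p}_\varepsilon = (p_{11} - \varepsilon, p_{12}, \ldots, p_{1,m-1}, p_{1m} + \varepsilon)$; then $\mathbf{p}_1 \prec \mathbf{p}_\varepsilon$. By Theorem 2.4 again, we have $f(\mathbf{p}_1, \mathbf{p}, \ldots, \mathbf{p}) \ge f(\mathbf{p}_\varepsilon, \mathbf{p}, \ldots, \mathbf{p})$. Now take

$$\varepsilon = \frac{(p_{11} - p_{1m}) + \left[(p_{1m} - p_{11})^2 + 2(\varphi(\mathbf{p}) - \varphi(\mathbf{p}_1))\right]^{1/2}}{2} > 0.$$

Then $\varphi(\mathbf{p}_\varepsilon) = \varphi(\mathbf{p})$ and $f(\mathbf{p}_1, \ldots, \mathbf{p}_k) \ge f(\mathbf{p}_\varepsilon, \mathbf{p}, \ldots, \mathbf{p})$. This completes the proof of the theorem.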

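Remark: For the large-sample approximation, it is easy to see that

$$\inf_{\Omega_2} P(\mathrm{CS}|R_1) \approx \inf \int_{-\infty}^{\infty} \Phi^{k-1}\left(\frac{\sigma_n(\mathbf{p})\,x + \sqrt{n}\,d}{\sigma_n(\mathbf{q})}\right) d\Phi(x), \tag{2.10}$$

where the infimum on the right side of (2.10) is over all vectors $\mathbf{p}$ and $\mathbf{q}$ for which $\varphi(\mathbf{p}) = \varphi(\mathbf{q})$.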

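In the following we will approximate the infimum of the probability of a correct selection under some restrictions.

Theorem 2.10. $\inf_{\Omega} P(\mathrm{CS}|R_1) \approx \inf_{\Omega_0'} P(\mathrm{CS}|R_1)$, where $\Omega_0' = \{(\mathbf{p}, \ldots, \mathbf{p}) \in \Omega_0 \mid \mathbf{p} = (p, \ldots, p, q),\ q = 1 - (m-1)p\}$.

Proof. Without loss of generality, we will assume that $\mathbf{p}_{(i)} = \mathbf{p}_i$, $i = 1, \ldots, k$ and $p_{i1} \le p_{i2} \le \cdots \le p_{im}$. We know that

$$\mathbf{p}_i^{(1)} \prec \mathbf{p}_i \prec \mathbf{p}_i^{(2)}, \quad i = 1, \ldots, k, \tag{2.11}$$

where

$$\mathbf{p}_i^{(1)} = \left(\frac{1 - p_{im}}{m-1}, \ldots, \frac{1 - p_{im}}{m-1}, p_{im}\right)$$

and

$$\mathbf{p}_i^{(2)} = (p_{i1}, \ldots, p_{i1}, 1 - (m-1)p_{i1}). \tag{2.12}$$

Let $P(\mathrm{CS}|R_1) = f(\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_k)$. By Theorem 2.4, we have

$$f(\mathbf{p}_1^{(2)}, \mathbf{p}_2^{(1)}, \ldots, \mathbf{p}_k^{(1)}) \le f(\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_k) \le f(\mathbf{p}_1^{(1)}, \mathbf{p}_2^{(2)}, \ldots, \mathbf{p}_k^{(2)}).$$

Let

$$\varphi'(p) = \varphi\left(\frac{1-p}{m-1}, \ldots, \frac{1-p}{m-1}, p\right) = \frac{(mp - 1)^2}{m(m-1)}; \tag{2.13}$$

then $\varphi'(p)$ is a continuous, strictly increasing function of $p$ whenever $p \ge \frac{1}{m}$. Hence there exist $\mathbf{p}_i^*$ such that

$$\mathbf{p}_i^{(1)} \prec \mathbf{p}_i^* \prec \mathbf{p}_i^{(2)}, \quad i = 1, \ldots, k,$$

where $\mathbf{p}_i^* = (p_i, \ldots, p_i, q_i)$ and $\varphi(\mathbf{p}_i^*) = \varphi(\mathbf{p}_i)$. Moreover,

$$f(\mathbf{p}_1^{(2)}, \mathbf{p}_2^{(1)}, \ldots, \mathbf{p}_k^{(1)}) \le f(\mathbf{p}_1^*, \mathbf{p}_2^*, \ldots, \mathbf{p}_k^*) \le f(\mathbf{p}_1^{(1)}, \mathbf{p}_2^{(2)}, \ldots, \mathbf{p}_k^{(2)}).$$

By Theorem 2.4, there exists a $\mathbf{p}^* = (p, \ldots, p, q)$ such that

$$\mathbf{p}_1^* \prec \mathbf{p}^* \prec \mathbf{p}_i^*, \quad i = 2, \ldots, k,$$

and

$$f(\mathbf{p}_1^*, \ldots, \mathbf{p}_k^*) \ge f(\mathbf{p}^*, \ldots, \mathbf{p}^*).$$

If $f(\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_k) \ge f(\mathbf{p}^*, \mathbf{p}^*, \ldots, \mathbf{p}^*)$, the result follows. Otherwise

$$f(\mathbf{p}_1^{(2)}, \mathbf{p}_2^{(1)}, \ldots, \mathbf{p}_k^{(1)}) \le f(\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_k) < f(\mathbf{p}^*, \mathbf{p}^*, \ldots, \mathbf{p}^*) \le f(\mathbf{p}_1^*, \mathbf{p}_2^*, \ldots, \mathbf{p}_k^*) \le f(\mathbf{p}_1^{(1)}, \mathbf{p}_2^{(2)}, \ldots, \mathbf{p}_k^{(2)}).$$

We may tolerate the difference between $f(\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_k)$ and $f(\mathbf{p}^*, \mathbf{p}^*, \ldots, \mathbf{p}^*)$ and still use this as a lower bound. Hence $\inf_{\Omega} P(\mathrm{CS}|R_1) \approx \inf_{\Omega_0'} P(\mathrm{CS}|R_1)$.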
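The two bracketing vectors used in this proof are easy to construct, and the majorization sandwich can be checked numerically. The sketch below is illustrative only: `lower_vector` keeps the largest component and averages the rest, `upper_vector` repeats the smallest component $m - 1$ times, and `phi_prime` is the closed form $(mp-1)^2 / (m(m-1))$. All names are hypothetical.

```python
def phi(p):
    """Squared distance of a probability vector from the uniform vector."""
    m = len(p)
    return sum((pj - 1.0 / m) ** 2 for pj in p)


def lower_vector(p):
    """Keep the largest component and spread the remaining mass evenly."""
    m, top = len(p), max(p)
    return [(1.0 - top) / (m - 1)] * (m - 1) + [top]


def upper_vector(p):
    """Repeat the smallest component m - 1 times and put the rest last."""
    m, low = len(p), min(p)
    return [low] * (m - 1) + [1.0 - (m - 1) * low]


def phi_prime(p, m):
    """phi evaluated at ((1-p)/(m-1), ..., (1-p)/(m-1), p)."""
    return (m * p - 1.0) ** 2 / (m * (m - 1.0))


def is_majorized_by(x, y, tol=1e-12):
    """Partial-sum test for x being majorized by y."""
    if abs(sum(x) - sum(y)) > tol:
        return False
    sum_x = sum_y = 0.0
    for a, b in zip(sorted(x, reverse=True), sorted(y, reverse=True)):
        sum_x += a
        sum_y += b
        if sum_x > sum_y + tol:
            return False
    return True


p = [0.1, 0.2, 0.3, 0.4]
low, high = lower_vector(p), upper_vector(p)
print(low, high)
print(is_majorized_by(low, p), is_majorized_by(p, high))
print(phi(low), phi(p), phi(high))
```

Since the function is Schur-convex, its values at the three vectors are ordered in the same way as the vectors themselves.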

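Remark: For large $n$, we have

$$\inf_{\Omega} P(\mathrm{CS}|R_1) \approx \inf_{\Omega_0'} \int_{-\infty}^{\infty} \Phi^{k-1}\left(x + \frac{\sqrt{n}\,d}{\sigma_n(\mathbf{p})}\right) d\Phi(x),$$

where the infimum on the right side is attained at $\mathbf{p} = \mathbf{p}^0$, with $\mathbf{p}^0$ defined by (2.7).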

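For $m = 2$, $\mathbf{p}_i = (p_i, 1 - p_i)$, $i = 1, \ldots, k$ and $\varphi(\mathbf{p}_i) = 2(p_i - \frac{1}{2})^2$. Since we are dealing with Schur-concave and Schur-convex functions, we may assume that $p_i \ge \frac{1}{2}$, $i = 1, \ldots, k$. Hence

$$\varphi(\mathbf{p}_i) \le \varphi(\mathbf{p}_j) \text{ if and only if } p_i \le p_j$$

and $\Omega = \Omega_1$ in this case. By Theorem 2.5, we have

$$\inf_{\Omega} P(\mathrm{CS}|R_1) = \inf_{\Omega_0} P(\mathrm{CS}|R_1).$$

Thus, the infimum of $P(\mathrm{CS}|R_1)$ is attained when $p_1 = \cdots = p_k = p$. For the small-sample case, we can solve the problem by using Theorem 2.7. For the large-sample case, we solve

$$\int_{-\infty}^{\infty} \Phi^{k-1}\left(x + \frac{\sqrt{n}\,d}{\sigma_n(\mathbf{p}^0)}\right) d\Phi(x) = P^*,$$

where $\mathbf{p}^0 = (p_0, 1 - p_0)$ and $p_0 = (2 + \sqrt{2})/4$.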
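For $m = 2$ the value $p_0 = (2 + \sqrt{2})/4$ can be checked with a simple grid search. The sketch assumes the delta-method variance $4[\sum_j p_j^3 - (\sum_j p_j^2)^2]$ for the limiting distribution of $\sqrt{n}\,(\varphi(\frac{1}{n}\mathbf{X}) - \varphi(\mathbf{p}))$; the function name and grid resolution are illustrative.

```python
def sigma2_two_sided(p):
    """Assumed asymptotic variance 4 * (sum p^3 - (sum p^2)^2) for (p, 1 - p)."""
    q = 1.0 - p
    return 4.0 * (p ** 3 + q ** 3 - (p * p + q * q) ** 2)


# Search p in [0.5, 1], matching the convention p >= 1/2 used for m = 2.
grid = [0.5 + i / 20000 for i in range(10001)]
p_best = max(grid, key=sigma2_two_sided)
p_0 = (2 + 2 ** 0.5) / 4

print(p_best, p_0, sigma2_two_sided(p_0))
```

Under this assumption the maximal variance is 1/4, so the standard deviation entering the large-sample equation for $d$ is 1/2 when $m = 2$.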


3.  SELECTING A SUBSET WHICH CONTAINS ALL GOOD POPULATIONS

Let $\pi_i \sim M(n, \mathbf{p}_i)$, $\mathbf{p}_i = (p_{i1}, \ldots, p_{im})$, $0 \le p_{ij} \le 1$, $\sum_{j=1}^{m} p_{ij} = 1$, $i = 1, \ldots, k$. We define $\pi_i$ as a good population if $\varphi(\mathbf{p}_i) = \sum_{j=1}^{m} \left(p_{ij} - \frac{1}{m}\right)^2 \le \delta$ and a bad population if $\varphi(\mathbf{p}_i) > \delta$, where $0 < \delta < 1 - \frac{1}{m}$ is prespecified. Our goal is to define a selection procedure which selects a subset of $\{\pi_1, \ldots, \pi_k\}$ such that the selected subset contains all good populations with probability at least $P^*$. With the same notation as in Section 2, we propose a natural selection procedure as follows:

$R_2$: Select $\pi_i$ if and only if $\varphi(\frac{1}{n}\mathbf{X}_i) \le c$, where $c \ge \delta$ is the smallest constant such that

$$\inf_{\Omega} P(\mathrm{CS}|R_2) \ge P^*. \tag{3.1}$$

Let $G = \{\mathbf{p} = (p_1, \ldots, p_m) \mid 0 \le p_i \le 1,\ \sum_{i=1}^{m} p_i = 1,\ \varphi(\mathbf{p}) \le \delta\}$ denote the parameter space of good populations. We assume that there are $k_1$ (unknown) good populations, $1 \le k_1 \le k$. Without loss of generality, we assume that $\mathbf{p}_1, \ldots, \mathbf{p}_{k_1} \in G$. Then we have the following lemma:

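Lemma 3.1. Let $\mathbf{X} \sim M(n, \mathbf{p})$ and $g(\mathbf{p}) = P\{\varphi(\frac{1}{n}\mathbf{X}) \le c\}$, $c > 0$. Then

$$\inf_{\mathbf{p} \in G} g(\mathbf{p}) = \inf_{\mathbf{p} \in G_0} g(\mathbf{p}), \tag{3.2}$$

where $G_0 = \{\mathbf{p} \in G \mid \varphi(\mathbf{p}) = \delta\}$.

Proof. For $\mathbf{p} \in G$ with $p_1 \le p_2 \le \cdots \le p_m$, if $\varphi(\mathbf{p}) < \delta$, we take

$$\varepsilon = \frac{(p_1 - p_m) + \sqrt{(p_m - p_1)^2 + 2(\delta - \varphi(\mathbf{p}))}}{2} > 0;$$

then

$$\mathbf{p} \prec \mathbf{p}_\varepsilon \quad \text{and} \quad \varphi(\mathbf{p}_\varepsilon) = \delta,$$

where

$$\mathbf{p}_\varepsilon = (p_1 - \varepsilon, p_2, \ldots, p_{m-1}, p_m + \varepsilon).$$

By Lemma 2.2, $g(\mathbf{p})$ is a Schur-concave function of $\mathbf{p}$. Hence

$$g(\mathbf{p}) \ge g(\mathbf{p}_\varepsilon).$$

This completes the proof of the lemma.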
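The perturbation used in this proof can be verified numerically. The sketch below moves mass $\varepsilon$ from the smallest to the largest component, with $\varepsilon$ chosen as the positive root of $2\varepsilon^2 + 2(p_m - p_1)\varepsilon = \delta - \varphi(\mathbf{p})$, and checks that the new vector sits exactly on the boundary $\varphi = \delta$. The function name `push_to_boundary` and the example values are hypothetical.

```python
from math import sqrt


def phi(p):
    """Squared distance of a probability vector from the uniform vector."""
    m = len(p)
    return sum((pj - 1.0 / m) ** 2 for pj in p)


def push_to_boundary(p, delta):
    """Return (p_eps, eps) with phi(p_eps) = delta, assuming phi(p) <= delta.

    This sketch does not check that the smallest component stays
    non-negative after eps is subtracted.
    """
    q = sorted(p)
    eps = ((q[0] - q[-1]) + sqrt((q[-1] - q[0]) ** 2 + 2.0 * (delta - phi(q)))) / 2.0
    q[0] -= eps
    q[-1] += eps
    return q, eps


p_eps, eps = push_to_boundary([0.3, 0.3, 0.4], delta=0.1)
print(p_eps, eps, phi(p_eps))
```

Moving mass from the smallest to the largest component always produces a vector that majorizes the original one, which is what allows the Schur-concavity of $g$ to be applied.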


In order to overcome the difficulty of the partial order relation, we consider the parameter space $G_1$ defined by

$$G_1 = \{\mathbf{p} \in G \mid \mathbf{p} \prec \mathbf{p}_\delta,\ \varphi(\mathbf{p}_\delta) \le \delta\},$$

where $\mathbf{p}_\delta$ is known or unknown.

For the case when $\mathbf{p}_\delta$ is known, we have $g(\mathbf{p}) \ge g(\mathbf{p}_\delta)$ for all $\mathbf{p} \in G_1$. Hence

$$\inf_{\mathbf{p} \in G_1} g(\mathbf{p}) = g(\mathbf{p}_\delta). \tag{3.3}$$

Further, we have the following result:

Theorem 3.2. Under the parameter space $G_1$, we have

$$\inf P(\mathrm{CS}|R_2) \ge (g(\mathbf{p}_\delta))^k.$$

The value $c$ can be taken as the solution to the equation

$$g(\mathbf{p}_\delta) = P^{*1/k}. \tag{3.4}$$


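If $\mathbf{p}_\delta$ is unknown, by using the same arguments as in the proof of Lemma 3.1, we may assume that $\varphi(\mathbf{p}_\delta) = \delta$. For large $n$, we have

$$g(\mathbf{p}_\delta) = P\left\{\varphi\left(\tfrac{1}{n}\mathbf{X}\right) \le c\right\} \approx \Phi\left(\frac{\sqrt{n}\,(c - \delta)}{\sigma_n(\mathbf{p}_\delta)}\right).$$

Hence the value $c$ can be taken as the solution to the equation

$$\Phi\left(\frac{\sqrt{n}\,(c - \delta)}{\sigma_n(\mathbf{p}^0)}\right) = P^{*1/k}, \tag{3.5}$$

where $\mathbf{p}^0$ is defined in (2.7).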
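The large-sample equation for $c$ can be inverted in closed form: $c = \delta + \sigma\,\Phi^{-1}(P^{*1/k}) / \sqrt{n}$. The sketch below does this with the standard library; `threshold_c` is a hypothetical name, and `sigma` is whatever value of the limiting standard deviation the user supplies (for instance its value at $\mathbf{p}^0$).

```python
from math import sqrt
from statistics import NormalDist


def threshold_c(delta, k, p_star, n, sigma):
    """Solve Phi(sqrt(n) * (c - delta) / sigma) = p_star ** (1 / k) for c."""
    z = NormalDist().inv_cdf(p_star ** (1.0 / k))
    return delta + z * sigma / sqrt(n)


# Hypothetical inputs: delta = 0.1, five dice, P* = 0.90, n = 100, sigma = 0.5.
print(threshold_c(0.1, 5, 0.90, 100, 0.5))
```

The threshold exceeds $\delta$ whenever $P^{*1/k} > 1/2$ and shrinks towards $\delta$ at rate $1/\sqrt{n}$.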

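Also, for each $\mathbf{p} = (p_1, \ldots, p_m)$, $p_1 \le p_2 \le \cdots \le p_m$, we have $\mathbf{p}^{(1)} \prec \mathbf{p} \prec \mathbf{p}^{(2)}$, where $\mathbf{p}^{(i)}$, $i = 1, 2$, are defined as in (2.12). Given $\Delta > 0$, we define

$$G_\Delta = \{\mathbf{p} \in G \mid |\varphi(\mathbf{p}^{(2)}) - \varphi(\mathbf{p})| \le \Delta\}.$$

Then we have the following result:

Lemma 3.3.

$$\inf_{\mathbf{p} \in G_\Delta} g(\mathbf{p}) \ge \inf_{\varphi'(p) \le \delta + \Delta} g'(p),$$

where $\varphi'(p) = \varphi\left(\frac{1-p}{m-1}, \ldots, \frac{1-p}{m-1}, p\right)$ and $g'(p) = g\left(\frac{1-p}{m-1}, \ldots, \frac{1-p}{m-1}, p\right)$.

Proof. For $\mathbf{p} \in G_\Delta$, we may assume that $\varphi(\mathbf{p}) = \delta$. Since $\mathbf{p}^{(1)} \prec \mathbf{p} \prec \mathbf{p}^{(2)}$, we have

$$g(\mathbf{p}) \ge g(\mathbf{p}^{(2)}) \quad \text{and} \quad \varphi(\mathbf{p}) \le \varphi(\mathbf{p}^{(2)}).$$

Further, $|\varphi(\mathbf{p}^{(2)}) - \varphi(\mathbf{p})| \le \Delta$, so $\varphi(\mathbf{p}^{(2)}) \le \delta + \Delta$.

Remark: $g$ is a Schur-concave function, hence

$$\inf_{\mathbf{p} \in G_\Delta} g(\mathbf{p}) \ge g'(\bar{p}), \tag{3.6}$$

where $\bar{p} = \frac{1}{m} + \sqrt{\frac{m-1}{m}(\delta + \Delta)}$.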
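The boundary value $\bar{p}$ is the root of $(mp - 1)^2 / (m(m-1)) = \delta + \Delta$ with $p \ge 1/m$, which gives $\bar{p} = \frac{1}{m} + \sqrt{\frac{m-1}{m}(\delta + \Delta)}$. A short numerical check, with illustrative function names and example inputs, is sketched below.

```python
from math import sqrt


def phi_prime(p, m):
    """phi evaluated at ((1-p)/(m-1), ..., (1-p)/(m-1), p)."""
    return (m * p - 1.0) ** 2 / (m * (m - 1.0))


def p_bar(m, delta, big_delta):
    """Largest last component p with phi_prime(p, m) = delta + big_delta.

    The result is a valid probability only if delta + big_delta <= 1 - 1/m.
    """
    return 1.0 / m + sqrt((m - 1.0) * (delta + big_delta) / m)


# Hypothetical inputs: a 4-sided die, delta = 0.05, tolerance 0.01.
pb = p_bar(4, 0.05, 0.01)
print(pb, phi_prime(pb, 4))
```

For $m = 2$ and $\Delta = 0$ the expression reduces to $\frac{1}{2} + \sqrt{\delta/2}$.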

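Theorem 3.4. Under the parameter space $G_\Delta$, we have

$$\inf P(\mathrm{CS}|R_2) \ge (g'(\bar{p}))^k,$$

where $\bar{p} = \frac{1}{m} + \sqrt{\frac{m-1}{m}(\delta + \Delta)}$. The value $c$ can be taken as the solution to the equation

$$g'(\bar{p}) = P^{*1/k}.$$

For large samples, we have the following result:

Theorem 3.5. For large $n$, under the parameter space $G_\Delta$, we have

$$\inf P(\mathrm{CS}|R_2) \approx \Phi^{k}\left(\frac{\sqrt{n}\,(c - \delta)}{\sigma_n(\bar{\mathbf{p}})}\right),$$

where $\sigma_n(\mathbf{p})$ is defined in (2.5) and $\bar{\mathbf{p}} = \left(\frac{1-\bar{p}}{m-1}, \ldots, \frac{1-\bar{p}}{m-1}, \bar{p}\right)$, $\bar{p} = \frac{1}{m} + \sqrt{\frac{m-1}{m}(\delta + \Delta)}$.

Proof. We may assume that $\varphi(\mathbf{p}) = \delta$. Then

$$g(\mathbf{p}) \approx \Phi\left(\frac{\sqrt{n}\,(c - \delta)}{\sigma_n(\mathbf{p})}\right).$$

Under $\varphi(\mathbf{p}) = \delta$,

$$\sigma_n^2(\mathbf{p}) = 4\left[\sum_{i=1}^{m} p_i^3 - \left(\delta + \frac{1}{m}\right)^2\right]$$

is a Schur-convex function. Furthermore, $\sigma_n^2(\mathbf{p}) = \sigma_n^2(\mathbf{p}^*)$ for some $\mathbf{p}^* = (p, \ldots, p, q)$. As a function of $q$, $\sigma_n^2(\mathbf{p}^*)$ is increasing in $q$. Thus

$$\sup_{\mathbf{p} \in G_\Delta} \sigma_n(\mathbf{p}) \le \sigma_n(\bar{\mathbf{p}}),$$

where $\bar{\mathbf{p}} = \left(\frac{1-\bar{p}}{m-1}, \ldots, \frac{1-\bar{p}}{m-1}, \bar{p}\right)$, $\bar{p} = \frac{1}{m} + \sqrt{\frac{m-1}{m}(\delta + \Delta)}$.

When $m = 2$, $\varphi(\mathbf{p}_i) \le \delta$ if and only if $p_i \le \frac{1}{2} + \sqrt{\frac{\delta}{2}}$. (Note that we again assume $p_i \ge \frac{1}{2}$.) Hence, $\inf_{\mathbf{p} \in G} g(\mathbf{p}) = g(\mathbf{p}^0)$, where $\mathbf{p}^0 = \left(\frac{1}{2} - \sqrt{\frac{\delta}{2}},\ \frac{1}{2} + \sqrt{\frac{\delta}{2}}\right)$. Moreover,

$$\inf P(\mathrm{CS}|R_2) = (g(\mathbf{p}^0))^k.$$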
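For $m = 2$ the function $g$ is a binomial probability and can be evaluated exactly, which gives the value of $(g(\mathbf{p}^0))^k$ for any candidate $c$. The sketch below is illustrative: the names `g_exact` and `pcs_bound` are hypothetical, and the small tolerance in the comparison only guards against floating-point rounding when a count lies exactly on the boundary.

```python
from math import comb


def g_exact(p, n, c):
    """P{ phi(X/n) <= c } for X ~ Binomial(n, p), with phi(x) = 2 (x - 1/2)^2."""
    return sum(
        comb(n, s) * p ** s * (1 - p) ** (n - s)
        for s in range(n + 1)
        if 2 * (s / n - 0.5) ** 2 <= c + 1e-12
    )


def pcs_bound(n, k, delta, c):
    """(g(p0))^k with p0 = 1/2 + sqrt(delta / 2), the worst good die for m = 2."""
    p0 = 0.5 + (delta / 2) ** 0.5
    return g_exact(p0, n, c) ** k


# Hypothetical inputs: n = 15 throws, k = 3 dice, delta = 0.02, candidate c = 0.2.
print(pcs_bound(15, 3, 0.02, 0.2))
```

Scanning the finitely many attainable values of $\varphi(s/n)$ in increasing order and stopping at the first $c$ with `pcs_bound(...) >= P*` would give the smallest admissible constant under these assumptions.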


4.  SELECTING THE DIE WITH THE GREATEST BIAS

In this section, let $\theta_i = \varphi(\mathbf{p}_i)$ be as defined by (1.1). We are now interested in the largest parameter $\theta_{[k]}$, that is, we wish to select the die with the greatest bias. Following the same notation as in Section 2, we propose a natural selection procedure $R_3$ as follows:

$R_3$: Select $\pi_i$ if and only if $Y_i \ge \max_{1 \le j \le k} Y_j - d$, where $d$ is the smallest non-negative number such that the probability requirement (1.2) is satisfied and where, as before, $Y_i = \varphi(\frac{1}{n}\mathbf{X}_i)$.

Analogous to the proof of Lemma 2.3, we have the following result.

Lemma 4.1. Let $\psi(\mathbf{x})$ be a Schur-convex (Schur-concave) function of $\mathbf{x}$ and $\mathbf{X} \sim M(n, \mathbf{p})$. Then $P\{\psi(\frac{1}{n}\mathbf{X}) \le d + \psi(\frac{1}{n}\mathbf{x})\}$ is a Schur-convex (Schur-concave) function of $\mathbf{x}$ when $\mathbf{p}$ is fixed.

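We define

$$\Omega_3 = \{\omega = (\mathbf{p}_1, \ldots, \mathbf{p}_k) \in \Omega \mid \mathbf{p}_{(i)} \prec \mathbf{p}_{(k)},\ i = 1, \ldots, k-1\}. \tag{4.1}$$

Analogous to Theorem 2.4 and Theorem 2.5, we have the following results:

Theorem 4.2. $P(\mathrm{CS}|R_3)$ is a Schur-convex function of $\mathbf{p}_{(k)}$ when all other $\mathbf{p}_{(i)}$, $i \ne k$, are kept fixed and is a Schur-concave function of $\mathbf{p}_{(j)}$, $j \ne k$, when all other $\mathbf{p}_{(i)}$, $i \ne j$, are kept fixed.

Theorem 4.3. $\inf_{\Omega_3} P(\mathrm{CS}|R_3) = \inf_{\Omega_0} P(\mathrm{CS}|R_3)$.

For $\mathbf{t} = (t_1, \ldots, t_m)$, $0 \le t_j \le nk$, $j = 1, \ldots, m$ and $\sum_{j=1}^{m} t_j = nk$, let

$$M(k, d(\mathbf{t}), \mathbf{t}, m, n) = {\sum}^{**} \prod_{i=1}^{k} \binom{n}{s_{i1}, \ldots, s_{im}} \tag{4.2}$$

where $\sum^{**}$ denotes the summation over the set of all $m$-tuples $\mathbf{s}_i = (s_{i1}, \ldots, s_{im})$, $i = 1, \ldots, k$, such that $0 \le s_{ij} \le n$, $i = 1, \ldots, k$, $j = 1, \ldots, m$, $\sum_{i=1}^{k} s_{ij} = t_j$, $j = 1, \ldots, m$, $\sum_{j=1}^{m} s_{ij} = n$, $i = 1, \ldots, k$, and

$$\varphi\left(\tfrac{1}{n}\mathbf{s}_k\right) \ge \max_{1 \le j \le k-1} \varphi\left(\tfrac{1}{n}\mathbf{s}_j\right) - d(\mathbf{t}),$$

for some constant $d(\mathbf{t})$ depending on $\mathbf{t}$. Analogous to Theorem 2.7, we have the following result:

Theorem 4.4. For given $P^*$ and each $\mathbf{t}$, let $d(\mathbf{t})$ be the smallest number such that

$$M(k, d(\mathbf{t}), \mathbf{t}, m, n) \ge \binom{nk}{t_1, \ldots, t_m} P^*$$

and let

$$d = \max_{\mathbf{t}} d(\mathbf{t});$$

then

$$\inf_{\Omega_0} P(\mathrm{CS}|R_3) \ge P^*. \tag{4.3}$$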

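For the large-sample approximation, we have the following result:

Theorem 4.5. For large $n$, we have

$$\inf_{\Omega_3} P(\mathrm{CS}|R_3) \approx \inf_{\mathbf{p}} \int_{-\infty}^{\infty} \Phi^{k-1}\left(x + \frac{\sqrt{n}\,d}{\sigma_n(\mathbf{p})}\right) d\Phi(x). \tag{4.4}$$

Remark: The value $d$ can be found by using equation (2.8), with the constant $d$ there read as the constant of the procedure $R_3$.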
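If we do not make the assumption “$\mathbf{p}_{(i)} \prec \mathbf{p}_{(k)}$, $i = 1, \ldots, k-1$”, we consider some partial solutions based on some other restrictions. For convenience, we assume that $\pi_k$ is the best population. Let

$$\sum_{r=1}^{s} p_{[m-r+1]} = \max_{1 \le i \le k-1} \left(\sum_{r=1}^{s} p_{i[m-r+1]}\right), \quad s = 1, \ldots, m, \tag{4.5}$$

where $p_{i[m-r+1]}$ is the $(m - r + 1)$st smallest component of $\mathbf{p}_i$. Then we can determine a vector $\mathbf{p}$ such that $\mathbf{p}_i \prec \mathbf{p}$, $i = 1, \ldots, k-1$, and if there is a $\mathbf{p}'$ such that $\mathbf{p}_i \prec \mathbf{p}'$, $i = 1, \ldots, k-1$, then $\mathbf{p} \prec \mathbf{p}'$. Since $\varphi$ is Schur-convex, we have $\varphi(\mathbf{p}_i) \le \varphi(\mathbf{p})$, $i = 1, \ldots, k-1$. We will consider the problem under the parameter space

$$\Omega_4 = \{\omega = (\mathbf{p}_1, \ldots, \mathbf{p}_k) \mid \varphi(\mathbf{p}_i) \le \varphi(\mathbf{p}) \le \varphi(\mathbf{p}_k),\ i = 1, \ldots, k-1\}. \tag{4.6}$$

We note that if $\mathbf{p}_i \prec \mathbf{p}_k$, $i = 1, \ldots, k-1$, then $\mathbf{p} \prec \mathbf{p}_k$ and hence $\varphi(\mathbf{p}) \le \varphi(\mathbf{p}_k)$. Hence the parameter space $\Omega_4$ includes the parameter space $\Omega_3$. Analogous to Theorem 2.9, we have the following result:

Theorem 4.6. The infimum of $P(\mathrm{CS}|R_3)$ over $\Omega_4$ is attained when $\mathbf{p}_i = \mathbf{p}$, $i = 1, \ldots, k-1$ and $\varphi(\mathbf{p}_1) = \cdots = \varphi(\mathbf{p}_k)$.

Proof. The proof parallels that of Theorem 2.9. The only difference is that the perturbed vector is now built from $\mathbf{p}_k = (p_{k1}, \ldots, p_{km})$, $p_{k1} \le \cdots \le p_{km}$, as

$$\mathbf{p}_\varepsilon = (p_{k1} + \varepsilon, p_{k2}, \ldots, p_{k,m-1}, p_{km} - \varepsilon)$$

with

$$\varepsilon = \frac{(p_{km} - p_{k1}) - \left[(p_{km} - p_{k1})^2 - 2(\varphi(\mathbf{p}_k) - \varphi(\mathbf{p}))\right]^{1/2}}{2}.$$

Note that $\varepsilon \ge 0$ provided that $2\varepsilon \le p_{km} - p_{k1}$.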
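Remark: For the large-sample approximation, we have

$$\inf_{\Omega_4} P(\mathrm{CS}|R_3) \approx \inf \int_{-\infty}^{\infty} \Phi^{k-1}\left(\frac{\sigma_n(\mathbf{p})\,x + \sqrt{n}\,d}{\sigma_n(\mathbf{q})}\right) d\Phi(x), \tag{4.7}$$

where the infimum on the right side of (4.7) is over all vectors $\mathbf{p}$ and $\mathbf{q}$ for which $\varphi(\mathbf{p}) = \varphi(\mathbf{q})$.

Analogous to Theorem 2.10, if we tolerate some loss, we may have the following result.

Theorem 4.7. $\inf_{\Omega} P(\mathrm{CS}|R_3) \approx \inf_{\Omega_0'} P(\mathrm{CS}|R_3)$.

Remark: For large $n$, we have

$$\inf_{\Omega} P(\mathrm{CS}|R_3) \approx \inf_{\Omega_0'} \int_{-\infty}^{\infty} \Phi^{k-1}\left(x + \frac{\sqrt{n}\,d}{\sigma_n(\mathbf{p})}\right) d\Phi(x),$$

where the infimum on the right side is attained at $\mathbf{p} = \mathbf{p}^0$, with $\mathbf{p}^0$ defined by (2.7).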
Table I. Table of $d$-values for the procedure $R_1$ ($m = 2$)

  n    P*     k=2     k=3     k=4     k=5     k=6     k=7

  2   .75   .5000   .5000   .5000   .5000   .5000   .5000
      .80   .5000   .5000   .5000   .5000   .5000   .5000
      .90   .5000   .5000   .5000   .5000   .5000   .5000
      .95   .5000   .5000   .5000   .5000   .5000   .5000

  3   .75   .4444   .4444   .4444   .4444   .4444   .4444
      .80   .4444   .4444   .4444   .4444   .4444   .4444
      .90   .4444   .4444   .4444   .4444   .4444   .4444
      .95   .4444   .4444   .4444   .4444   .4444   .4444

  4   .75   .3750   .3750   .5000   .5000   .5000   .5000
      .80   .5000   .5000   .5000   .5000   .5000   .5000
      .90   .5000   .5000   .5000   .5000   .5000   .5000
      .95   .5000   .5000   .5000   .5000   .5000   .5000

  5   .75   .3200   .4800   .4800   .4800   .4800   .4800
      .80   .4800   .4800   .4800   .4800   .4800   .4800
      .90   .4800   .4800   .4800   .4800   .4800   .4800
      .95   .4800   .4800   .4800   .4800   .4800   .4800

  6   .75   .2778   .4444   .4444   .4444   .4444   .4444
      .80   .4444   .4444   .4444   .4444   .4444   .4444
      .90   .4444   .4444   .4444   .5000   .5000   .5000
      .95   .5000   .5000   .5000   .5000   .5000   .5000

  7   .75   .2449   .4082   .4082   .4082   .4082   .4082
      .80   .4082   .4082   .4082   .4082   .4082   .4082
      .90   .4082   .4082   .4898   .4898   .4898   .4898
      .95   .4898   .4898   .4898   .4898   .4898   .4898

  8   .75   .2500   .3750   .3750   .3750   .3750   .3750
      .80   .3750   .3750   .3750   .3750   .3750   .3750
      .90   .4688   .4688   .4688   .4688   .4688   .4688
      .95   .4688   .4688   .4688   .4688   .4688   .5000

  9   .75   .2469   .3457   .3457   .3457   .3457   .3457
      .80   .3457   .3457   .3457   .3457   .3457   .3457
      .90   .4444   .4444   .4444   .4444   .4444   .4444
      .95   .4444   .4444   .4444   .4938   .4938   .4938

 10   .75   .2400   .3200   .3200   .3200   .3200   .3200
      .80   .3200   .3200   .3200   .3200   .3200   .3200
      .90   .4200   .4200   .4200   .4200   .4200   .4200
      .95   .4200   .4200   .4800   .4800   .4800   .4800

 11   .75   .2314   .2975   .2975   .2975   .2975   .2975
      .80   .2975   .2975   .2975   .2975   .2975   .2975
      .90   .3967   .3967   .3967   .3967   .3967   .3967
      .95   .3967   .3967   .4628   .4628   .4628   .4628

 12   .75   .2222   .2778   .2778   .2778   .2917   .2917
      .80   .2778   .2778   .2917   .2917   .2917   .2917
      .90   .3750   .3750   .3750   .3750   .3750   .3750
      .95   .3750   .3750   .4444   .4444   .4444   .4444

 13   .75   .2130   .2604   .2604   .2604   .2840   .2840
      .80   .2604   .2604   .2840   .2840   .2840   .2840
      .90   .3550   .3550   .3550   .3550   .3550   .3550
      .95   .3550   .3550   .4260   .4260   .4260   .4260

 14   .75   .2041   .2449   .2449   .2449   .2755   .2755
      .80   .2449   .2449   .2755   .2755   .2755   .2755
      .90   .3367   .3367   .3367   .3367   .3367   .3367
      .95   .3367   .3367   .4082   .4082   .4082   .4082

 15   .75   .1956   .2311   .2311   .2311   .2667   .2677
      .80   .2311   .2311   .2667   .2667   .2667   .2677
      .90   .3200   .3200   .3200   .3200   .3200   .3200
      .95   .3200   .3200   .3911   .3911   .3911   .3911